Abstract Body


Background / Context: 

Although research on the effectiveness of differentiated and enriched instruction in 
improving the achievement of diverse students is still emerging, some studies (Beecher & 
Sweeney, 2008; Brimijoin, 2001; Gavin & Casa, 2012; Tieso, 2002; Tomlinson, Brimijoin, & 
Narvaez, 2008) suggest that students in academically diverse classrooms benefited academically 
from differentiated learning experiences. Brighton, Hertberg, Moon, Tomlinson, and Callahan 
(2005) found modest improvements in all content areas for middle school students involved in 
differentiated instruction and assessment. Recently, Reis, McCoach, Little, Muller, and Kaniskan 
(2011) found improvement in reading for one suburban district and oral reading fluency for one 
suburban school. Additionally, both oral reading fluency and reading comprehension were higher 
in the treatment group in one low-SES urban school. More research is clearly warranted to assess 
the effectiveness of differentiated and enriched instruction and curricula. 

Purpose / Objective / Research Question / Focus of Study: 

The primary research question was “What is the impact of implementing the 
pre-differentiated mathematics curricula in algebra, geometry and measurement, and graphing and 
data analysis on the achievement of grade 3 students, after controlling for pretest achievement 
scores?” Specifically, we were interested in examining whether math achievement outcomes of 
treatment and control group students differed. 

Setting: 

The study included 42 public schools and one private school in 12 states, with the 
majority in rural settings and 3 schools within a large city. Nine schools had more than 20% 
non-White/non-Asian student enrollment and 5 schools had more than 30% non-White/non-Asian 
student enrollment. Free and reduced-price meal eligibility for students at these schools 
ranged from 0 to 68%. In both groups, teachers were predominantly female and White. 
Treatment and control teachers had similar characteristics: over 57% had 10 or more years of 
teaching experience; a majority had less than 10 years of experience with grade 3 students; and 
over 56% had master’s degrees. 

Population / Participants / Subjects: 

The final analytic sample included 2,290 students (1,391 treatment and 899 control). Of 
the students in the analytic sample, a similar percentage of males (50%) and females (49%) 
comprised the treatment and control groups across all schools. Over 80% of students in the 
treatment and control groups were White, with fewer than 20% representing other racial/ethnic 
groups. 

Intervention / Program / Practice: 

This study compared researcher-designed, pre-differentiated and enriched mathematics 
curricula in algebra, geometry and measurement, and graphing and data analysis to the districts’ 
mathematics curricula. 

Three widely adopted models in gifted and talented education place the teacher in the role 
of knowledge broker, facilitator, and guide, emphasizing differentiation of curricula in general 
education classrooms as well as in pull-out and special classes designed for identified gifted and 
talented students. Elements of these models were combined and utilized to develop the current 
study’s mathematics units: Differentiation of Instruction Model (Tomlinson, 2001); Depth and 
Complexity Model (Kaplan, 2009); and Schoolwide Enrichment Model (Renzulli & Reis, 1997). 

Using pre-assessments accompanying each unit, teachers were guided in their selection of 
differentiated lesson options, based on the same challenging concepts, appropriate for each 
student. For most of the units’ lessons, three levels of scaffolding were embedded in the lessons’ 
activities. This form of differentiating by students’ demonstrated prior knowledge, often known 
as differentiation by readiness (Tomlinson, 2001; Tomlinson & Jarvis, 2009), is referred to as 
“tiering.” Tiered activities (Adams & Pierce, 2006; Tomlinson, 2001) function to lead students 
with different levels of initial knowledge and skills to master a similar “big idea” objective 
through adjustment of such aspects of the assignment as simplicity/complexity, 
concreteness/abstractness, more structure/less structure, etc. (Tomlinson, 2001). Treatment 
teachers participated in 2 days of onsite professional development and completed teacher logs upon 
completion of each unit; research team members maintained weekly contact with treatment 
teachers. Over 90% of treatment teachers completed all three units. 

Research Design: 

This multisite cluster-randomized controlled trial randomly assigned 141 general education 
classrooms (teachers) within 43 schools across 12 states to treatment or control conditions. 
Treatment teachers were required to implement three curricular units, which would supplant the 
district’s adopted mathematics curricula for 16 weeks. Control group teachers continued with the 
district’s adopted mathematics curricula or “business as usual.” Of the 141 teachers, 84 were 
assigned to the treatment condition and 57 were assigned to the control condition. In two 
instances of co-teaching, the co-teachers were assigned to condition as a single unit. 

Cluster-level randomization was selected for “good practical and scientific reasons” 
(Shadish, Cook, & Campbell, 2002, p. 254). On a practical level, participant recruitment required 
support from school administrators, for which the cluster-level design was pragmatically suited. 
Scientifically, we hoped to answer questions about real students in real classrooms for whom the 
layers of clustered data provide nuanced estimates of outcomes. 

During the spring prior to the intervention, grade 2 students in the participating schools 
completed the Level 8 ITBS Math Problems subtest or another nationally standardized achievement 
test. All pretest measures of ability and achievement were aligned using the equipercentile 
method in which scale scores were converted to z-scores for comparability. Pretest math 
achievement scores were used as a covariate in the subsequent analyses. After the curricular 
implementation was complete, treatment and control students took the Level 9 ITBS Math 
Problem Solving and Data Interpretation subtest as a posttest achievement measure, which served 
as the dependent variable for the 3-level analyses. 

Data Collection and Analysis: 


Assessments/Measures 

Teachers administered one mathematics subtest of the Iowa Tests of Basic Skills (ITBS) 
to the treatment and control students. The ITBS test content is aligned with the most current 
content standards, curriculum frameworks, and instructional materials. The test was standardized 
on a national sample of students K-9, with approximately 3,000 students per level per form 
completing the tests. Internal consistency estimates using KR 20 varied between .79 and .98. 
Students in the standardization sample represented various types of communities, ethnicity, race, 
and socioeconomic status. The standardization sample included public, parochial, and 
non-parochial schools. Schools in the standardization were further stratified by socioeconomic status. 
Data from these sources were used to develop special norms for a variety of groups (e.g., 
race/ethnicity, public school) (Hoover et al., 2003). 

The ITBS Level 8 Math Problems subtest was administered to grade 2 students prior to 
the curricular intervention to obtain information on students’ achievement in mathematics. The 
Level 8 ITBS subtest had 30 items. A small proportion of students completed other mathematics 
achievement pretests (the TerraNova, the Measure of Academic Progress [MAP], or the Stanford 
Achievement Test [SAT]). Because the achievement tests were on different scales, z-scores for 
the scores on each of the four achievement tests were calculated so that students’ pretest 
achievement could be compared across tests. 
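
The standardization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say whether each test was standardized against the study sample or against national norms, so the sketch standardizes within the sample of students who took each test and uses the population standard deviation. The function name, record layout, and scores are hypothetical and invented for the example.

```python
from statistics import mean, pstdev


def z_scores_by_test(records):
    """Standardize each score against the mean and SD of its own test.

    records: list of (student_id, test_name, scale_score) tuples.
    Returns a dict mapping student_id -> z-score.
    Assumption: each test is standardized within the students who took it.
    """
    by_test = {}
    for _, test, score in records:
        by_test.setdefault(test, []).append(score)
    # Mean and (population) SD per test; a test with zero spread would
    # raise ZeroDivisionError below and needs separate handling.
    stats = {test: (mean(s), pstdev(s)) for test, s in by_test.items()}
    return {sid: (score - stats[test][0]) / stats[test][1]
            for sid, test, score in records}


# Hypothetical scale scores, invented for illustration only
records = [
    ("s1", "ITBS", 170.0), ("s2", "ITBS", 180.0), ("s3", "ITBS", 190.0),
    ("s4", "MAP", 185.0), ("s5", "MAP", 195.0), ("s6", "MAP", 205.0),
]
z = z_scores_by_test(records)
```

After standardization, students at the same relative position on different tests (here s1 and s4) receive the same z-score, which is what allows the pretests to serve as a single covariate.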

Analyses 

To examine the effects of the differentiated curricula, we first ran a series of 3-level 
regression models using HLM 7.0 software (Raudenbush, Bryk, & Congdon, 2011). At level 1, 
we included pre-ITBS score, which was grand mean centered, and “gifted” status, which was 
defined as students with CogAT composite IQ scores in the top 10% of their respective schools. 
At level 2, we included treatment, which was dummy coded, so that 0 represented a control 
classroom and 1 represented a treatment classroom. At level 3, we controlled for the school mean 
achievement by creating an aggregate of each school’s second grade math pretest score. School 
aggregate math score was also a z-score. Because the ITBS scores exhibited a ceiling effect, the 
data were reanalyzed using a multilevel Tobit model, which accounted for the censored nature of 
the data. The results of the two analyses were quite similar and led to identical conclusions 
about treatment effectiveness. Table 1 contains the final results from the 3-level 
multilevel analysis in HLM and the two-level multilevel Tobit analyses with corrected standard 
errors in MPLUS 6. MPLUS 7 now allows for three-level organizational analyses, so the data 
will be rerun using a 3-level Tobit model in MPLUS prior to the presentation, but given the 
similarity between the current analyses, we do not expect the results to change appreciably. 
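
For readers who prefer equations, the 3-level specification described above can be written out as follows. This is a reconstruction inferred from the coefficient labels in Table 1, using conventional three-level HLM notation; the variable names (PreITBS, Gifted, Treatment, SchoolPre) and the ceiling symbol c are labels chosen here for illustration, not the authors' own. The final line is the usual right-censored Tobit form and is an assumption about how the ceiling was handled.

```latex
\begin{align*}
% Level 1: student i in classroom j in school k
Y_{ijk} &= \pi_{0jk} + \pi_{1jk}\,\mathrm{PreITBS}_{ijk}
           + \pi_{2jk}\,\mathrm{Gifted}_{ijk} + e_{ijk} \\
% Level 2: classrooms (treatment is dummy coded 0/1)
\pi_{0jk} &= \beta_{00k} + \beta_{01k}\,\mathrm{Treatment}_{jk} + r_{0jk} \\
\pi_{1jk} &= \beta_{10k} + \beta_{11k}\,\mathrm{Treatment}_{jk} \\
\pi_{2jk} &= \beta_{20k} \\
% Level 3: schools (school aggregate pretest z-score)
\beta_{00k} &= \gamma_{000} + \gamma_{001}\,\mathrm{SchoolPre}_{k} + u_{00k} \\
\beta_{01k} &= \gamma_{010} + \gamma_{011}\,\mathrm{SchoolPre}_{k} \\
\beta_{10k} &= \gamma_{100} + \gamma_{101}\,\mathrm{SchoolPre}_{k} \\
\beta_{11k} &= \gamma_{110} + \gamma_{111}\,\mathrm{SchoolPre}_{k} \\
\beta_{20k} &= \gamma_{200} \\
% Tobit (assumed form): observed score is the latent score capped at c
Y_{ijk} &= \min\bigl(Y^{*}_{ijk},\, c\bigr)
\end{align*}
```

Written this way, the model has nine fixed effects and three variance components, which matches the 12 parameters reported for the HLM model in Table 1; dropping the school-level variance gives the 11 parameters reported for the two-level MPLUS models.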

Findings / Results: 

The final model failed to show a main effect for treatment, but did uncover interesting 
cross-level interaction effects. Examining Model 3, although there was no statistically significant 
difference between treatment and control groups when school aggregate pre-ITBS was held 
constant, there was a statistically significant effect of treatment on the pre-ITBS slope, that is, on 
the effect of pre-ITBS on post-ITBS. The effect of pre-ITBS on post-ITBS was stronger in 
treatment classes than in control classes, indicating that the treatment appeared to have a 
differentiating effect on students. 

However, the picture is even more complex. The school aggregate pre-ITBS score 
moderated the cross-level interaction between treatment and pretest score. In schools with lower 
pre-ITBS scores, the treatment slope was steeper than the control slope; in higher aggregate 
pre-ITBS schools, this effect was reversed. These 3-way interaction effects are most easily 
understood graphically. Therefore, Figures 1, 2, and 3 illustrate the relationship between 
pre-ITBS and post-ITBS scores in low aggregate pre-ITBS schools, high aggregate pre-ITBS 
schools, and average aggregate pre-ITBS schools. In average aggregate pre-ITBS schools, there 
appears to be no discernible treatment effect, based on the final HLM models. In low pre-ITBS 
schools, students with higher pretest scores do better in the treatment group, and students with 
lower pretest scores do better in the control group. In high pre-ITBS schools, students with lower 
pretest scores do better in the treatment group, and students with high pre-ITBS scores appear to 
do equally well in either group. See Table 1 for the results of the analyses. 
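
The pattern described above can be explored numerically by plugging the fixed-effect estimates from the HLM column of Table 1 into the prediction equation. The sketch below is an illustration, not the authors' plotting code. It assumes a non-gifted student, treats a value of -1, 0, or +1 on the school aggregate z-score as a "low," "average," or "high" school (the figures use one standard deviation of the school-level distribution, which may not equal one unit), and ignores all random effects. The function names are hypothetical.

```python
# Fixed-effect estimates copied from the HLM column of Table 1
G000, G001 = 204.17, 2.90   # intercept; school mean pretest
G010, G011 = 0.81, 1.12     # treatment; treatment x school mean pretest
G100, G101 = 13.46, 2.39    # pretest slope; pretest x school mean pretest
G110, G111 = 2.28, -6.92    # treatment x pretest; three-way interaction


def predicted_post(pre, treat, school):
    """Predicted post-ITBS score for a non-gifted student.

    pre: grand-mean-centered pretest z-score
    treat: 0 for control, 1 for treatment
    school: school aggregate pretest z-score (assumed unit scale)
    """
    intercept = G000 + G001 * school + (G010 + G011 * school) * treat
    slope = G100 + G101 * school + (G110 + G111 * school) * treat
    return intercept + slope * pre


def treatment_gap(pre, school):
    """Predicted treatment-minus-control difference at a given pretest."""
    return predicted_post(pre, 1, school) - predicted_post(pre, 0, school)


for school in (-1.0, 0.0, 1.0):
    gaps = [round(treatment_gap(pre, school), 2) for pre in (-1.0, 0.0, 1.0)]
    print(f"school aggregate {school:+.0f}: gaps at pre = -1, 0, +1 -> {gaps}")
```

Under these assumptions the predicted gap in a school one unit below the mean is roughly -9.5 points for a student at pre = -1 and roughly +8.9 points at pre = +1, while in a school one unit above the mean it is roughly +6.6 and -2.7 points, which is the reversal the text describes.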

To illustrate this in another manner, we divided the students into four groups based on their 
relative pretest levels. Table 2 shows the differences between the treatment and control groups 
disaggregated by students’ relative standing within their schools, based on their standardized 
pretest scores. The treatment effect was negligible for average and low achievers. However, there 
was a difference of .41 standard deviation units, favoring the treatment for the highest achievers. 
These results suggest that differentiated instruction may be most effective for the highest 
achievers in a school. This effect was likely strongest for the highest achievers in the lower 
achieving schools due to the observed ceiling effects on the post-ITBS. 
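
The effect sizes in Table 2 can be checked from the reported means, standard deviations, and group sizes. The sketch below assumes Cohen's d was computed as the treatment-minus-control mean difference divided by the pooled standard deviation; the abstract does not state the formula, so this is an assumption, although it reproduces the reported values to two decimals.

```python
from math import sqrt


def cohens_d(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Treatment-minus-control mean difference over the pooled SD."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (m_t - m_c) / sqrt(pooled_var)


# (mean, SD, N) for treatment, then (mean, SD, N) for control, from Table 2
table2 = {
    "Low":          (181.34, 17.37, 168, 182.66, 15.99, 133),
    "Low-Average":  (198.66, 19.54, 523, 200.4, 19.13, 301),
    "High-Average": (214.18, 19.00, 495, 213.41, 19.09, 338),
    "High":         (228.61, 13.69, 205, 222.60, 16.46, 127),
    "Total":        (206.51, 22.98, 1391, 205.80, 22.03, 899),
}

effect_sizes = {group: round(cohens_d(*row), 2) for group, row in table2.items()}
print(effect_sizes)
```

The .41 for the highest pretest group corresponds to a raw difference of about 6 scale-score points against a pooled standard deviation of about 14.8.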

Conclusions: 

In general, the post-ITBS scores of students in the treatment group were comparable to those in the 
control group. However, high achieving students did appear to derive some benefit from the 
differentiated curricula. This was especially true for high achieving students in lower achieving 
schools. Several conclusions can be posited: 

1. The ceiling on the norm-referenced test was not high enough to record students’ true 
level of content, concepts, and skills mastered in problem solving and data interpretation. 

2. The norm-referenced ITBS was not a good match to content in the algebra and geometry 
and measurement units. 

3. The lack of a main effect illustrated that eliminating 16 weeks of the “business as usual” 
curricula for the treatment group students did not have a negative impact on students 
involved in the intervention. 

4. The curricula benefited students differentially depending on the achievement status of 
their schools and their designation as treatment group or control group students. 

We were able to replace grade level curriculum with more challenging and enriching 
curriculum without negatively impacting standardized test scores. In the current age of increased 
accountability, teachers are often afraid to stray from the mainstream curriculum for fear of 
jeopardizing their state test scores. Assuming the ITBS posttest measures the typical grade 3 
mathematics curricula, the current study provides some evidence that teachers can replace typical 
at-grade level curriculum with more challenging, enriched mathematics curriculum without 
suffering adverse consequences on standardized assessments. Viewed through this lens, the 
results of this study should encourage teachers to consider stepping out of the lock-step 
curriculum to differentiate their math curriculum. 

The measurement issues that plagued this study (i.e., the low ceiling on the ITBS and the lack of 
alignment between the ITBS and the differentiated units) are a major limitation. Future research 
should explore the differentiated units using different post-assessments. Including the 
researcher-developed curriculum-based measures for both the treatment and control groups and 
utilizing out-of-grade-level assessments would have provided a clearer picture of the effects of 
the intervention. 

Appendix B. Tables and Figures 

Table 1 

Results of Multilevel Analyses of the Treatment Effect Using a Tobit Model 

                                         HLM Model           MPLUS               MPLUS
                                         Coefficient (SE)    Non-censored        Censored

Model for intercept of posttest score (β00j)
  Intercept
    Intercept (γ000)                     204.17*** (1.08)    205.32*** (.92)     206.09*** (.99)
    Mean school pretest (γ001)           2.90 (3.11)         3.30 (3.06)         4.43 (3.34)
  Treatment
    Intercept (γ010)                     .81 (1.10)          .51 (.96)           .87 (1.08)
    Mean school pretest (γ011)           1.12 (3.25)         2.44 (2.80)         1.76 (3.23)

Model for student achievement slope
  Intercept
    Intercept (γ100)                     13.46*** (0.64)     13.31 (.64)         14.33 (.72)
    Mean school pretest (γ101)           2.39 (1.81)         2.01 (1.78)         4.01* (1.82)
  Treatment
    Intercept (γ110)                     2.28 (0.82)**       2.26** (.75)        2.79** (.83)
    Mean school pretest (γ111)           -6.92** (2.36)      -6.11** (2.16)      -6.77** (2.29)

Model for gifted effect
  Intercept
    Intercept (γ200)                     6.88*** (1.27)      6.67*** (.91)       9.89*** (1.24)

Variance
  Level 1 (between students)
    Var(eijk)                            245.89 (7.84)       242.31 (7.56)       303.04 (12.62)
  Level 2 (between teachers)
    Var(r0jk) = τπ                       17.07*** (4.88)     28.96 (7.23)        34.93 (8.83)
  Level 3 (between schools)
    Var(u00k)                            14.85*** (6.07)

Goodness of fit
  AIC                                    17596.6             19073.7             17580.8
  BIC                                    17665.4             19136.8             17643.8
  Deviance                               17572.6             19051.7             17558.8
  Parameters                             12                  11                  11

Table 2 

Mean Posttest Achievement of Treatment and Control Students in Four Categories of Pretest 
Achievement 

                                          Pretest Achievement
Experimental             Low         Low-Average    High-Average    High
Condition                Pretest     Pretest        Pretest         Pretest     Total

Control
  Group N                133         301            338             127         899
  Mean                   182.66      200.4          213.41          222.60      205.80
  Standard Deviation     15.99       19.13          19.09           16.46       22.03

Treatment
  Group N                168         523            495             205         1391
  Mean                   181.34      198.66         214.18          228.61      206.51
  Standard Deviation     17.37       19.54          19.00           13.69       22.98

Effect Size
  Cohen’s d              -.08        -.09           .04             .41         .03

Figure 1. Predicted values for students with a given math pretest score (X-axis) on final math 
posttest score (Y-axis) in schools that scored one standard deviation below the sample mean on 
pre math achievement. 

Figure 2. Predicted values for students with a given math pretest score (X-axis) on final math 
posttest score (Y-axis) in schools that scored one standard deviation above the sample mean on 
pre math achievement. 


SREE Spring 2013 Conference Abstract Template 


Figure 3. Predicted values for students with a given math pretest score (X-axis) on final math 
posttest score (Y-axis) in schools that scored at the sample mean on pre math achievement. 
[Plot not reproduced: the Y-axis runs from 100 to 250 and the X-axis from -2 to 2, with separate 
lines for the treatment and control groups at the mean.] 